Polycystic ovary syndrome (PCOS) is a syndrome documented in women of menstruating age.
Commonly documented symptoms include period pain, irregular periods, ovary-related problems, and hormone imbalance.
Patients with PCOS often have difficulty becoming pregnant and may experience complications during pregnancy.
However, the cause of PCOS has still not been established.
The aim of this study is to examine a data set (found on Kaggle) of patients with and without PCOS. The data set was collected in India, and the data come from 10 different hospitals.
Raw data:
541 observations of 45 variables
01_load_data:
Simply loads the data
02_clean_data:
Convert units (inch to cm)
Round and group BMI
Change blood type and cycle from numeric values to characters
Create a new column for cycle/pregnancy stage
Merge the data frames into one file
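The unit conversion and recoding steps listed above could look roughly like the sketch below. It is only an illustration: the toy data, the column names `waist_inch` and `blood_group`, and the lookup vector `blood_labels` are assumptions, and the real numeric coding should be taken from the data set documentation.

```r
library(dplyr)

# Toy data standing in for the real body measurements (values are made up)
toy_measurements <- tibble(
  waist_inch  = c(30, 36),
  blood_group = c(11, 15)
)

# Hypothetical lookup from numeric code to label (assumed, not from the source)
blood_labels <- c("11" = "A+", "15" = "O+")

toy_clean <- toy_measurements |>
  mutate(
    # 1 inch = 2.54 cm
    waist_cm = waist_inch * 2.54,
    # Replace the numeric code with its character label
    blood_group = unname(blood_labels[as.character(blood_group)])
  ) |>
  select(-waist_inch)
```

A named lookup vector keeps the code-to-label mapping in one place, which makes it easy to check against the data dictionary.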
library(dplyr)

# Round BMI to one decimal and divide it into ordered categories
body_measurements <- body_measurements |>
  mutate(BMI = round(BMI, 1)) |>
  mutate(BMI_class = case_when(
    BMI < 18.5 ~ "Underweight",
    BMI >= 18.5 & BMI < 25 ~ "Normal weight",
    BMI >= 25 & BMI < 30 ~ "Overweight",
    BMI >= 30 ~ "Obesity")) |>
  # Store the classes as a factor so they sort from lowest to highest BMI
  mutate(BMI_class = factor(BMI_class,
                            levels = c("Underweight",
                                       "Normal weight",
                                       "Overweight",
                                       "Obesity"))) |>
  relocate(BMI_class, .after = BMI)
Information on the PCOS dataset
Dimensions:
# A tibble: 2 × 1
`PCOS dimensions`
<int>
1 541
2 44
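A dimensions table of this shape can be produced by wrapping `dim()` in a tibble. The sketch below uses a small made-up data frame, `toy_pcos`, as a stand-in for the cleaned data; only the column name `PCOS dimensions` is taken from the output above.

```r
library(dplyr)

# Toy stand-in for the cleaned data set (values are made up)
toy_pcos <- tibble(
  id             = 1:3,
  PCOS_diagnosis = c("No", "Yes", "No")
)

# dim() returns c(rows, columns); wrapping it in a tibble gives a 2 x 1 table
pcos_dims <- tibble(`PCOS dimensions` = dim(toy_pcos))
pcos_dims
```

The first row is the number of observations and the second row is the number of variables.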
Count of how many have PCOS:
# A tibble: 2 × 2
PCOS_diagnosis n
<chr> <int>
1 No 364
2 Yes 177
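A count like this is typically obtained with `dplyr::count()`. The sketch below assumes a character column named `PCOS_diagnosis`, as in the output above, and uses made-up toy data in place of the real data set.

```r
library(dplyr)

# Toy stand-in for the cleaned data set (values are made up)
toy_diagnoses <- tibble(
  PCOS_diagnosis = c("No", "Yes", "No", "No", "Yes")
)

# One row per diagnosis value, with the number of patients in column n
diagnosis_counts <- toy_diagnoses |>
  count(PCOS_diagnosis)
diagnosis_counts
```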
In this analysis, we have looked at which factors from the body measurements data patients diagnosed with PCOS potentially have in common.
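One simple way to look for such shared factors is to compare the share of diagnosed patients across groups of a body measurement. The sketch below does this for BMI class. It is an illustration on made-up toy data, not the analysis actually performed, and it assumes the columns `BMI_class` and `PCOS_diagnosis` from the cleaning step.

```r
library(dplyr)

# Toy stand-in for the cleaned data (values are made up)
toy_patients <- tibble(
  BMI_class      = c("Normal weight", "Normal weight", "Overweight", "Overweight"),
  PCOS_diagnosis = c("No", "Yes", "Yes", "Yes")
)

# Share of patients with a PCOS diagnosis within each BMI class
pcos_share <- toy_patients |>
  group_by(BMI_class) |>
  summarise(
    n_patients = n(),
    share_pcos = mean(PCOS_diagnosis == "Yes")
  )
pcos_share
```

The same pattern can be repeated for any other grouped body measurement by changing the column passed to `group_by()`.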